Goto

Collaborating Authors

 material synthesis


Benchmarking large language models for materials synthesis: the case of atomic layer deposition

Yanguas-Gil, Angel, Dearing, Matthew T., Elam, Jeffrey W., Jones, Jessica C., Kim, Sungjoon, Mohammad, Adnan, Nguyen, Chi Thang, Sengupta, Bratin

arXiv.org Artificial Intelligence

In this work we introduce an open-ended question benchmark, ALDbench, to evaluate the performance of large language models (LLMs) in materials synthesis, and in particular in the field of atomic layer deposition, a thin film growth technique used in energy applications and microelectronics. Our benchmark comprises questions with a level of difficulty ranging from graduate level to domain expert current with the state of the art in the field. Human experts reviewed the questions along the criteria of difficulty and specificity, and the model responses along four different criteria: overall quality, specificity, relevance, and accuracy. We ran this benchmark on an instance of OpenAI's GPT-4o. The responses from the model received a composite quality score of 3.7 on a 1 to 5 scale, consistent with a passing grade. However, 36% of the questions received at least one below average score. An in-depth analysis of the responses identified at least five instances of suspected hallucination. Finally, we observed statistically significant correlations between the difficulty of the question and the quality of the response, the difficulty of the question and the relevance of the response, and the specificity of the question and the accuracy of the response as graded by the human experts. This emphasizes the need to evaluate LLMs across multiple criteria beyond difficulty or accuracy.


Material synthesis through simulations guided by machine learning: a position paper

Syed, Usman, Cunico, Federico, Khan, Uzair, Radicchi, Eros, Setti, Francesco, Speghini, Adolfo, Marone, Paolo, Semenzin, Filiberto, Cristani, Marco

arXiv.org Artificial Intelligence

In this position paper, we propose an approach for sustainable data collection in the field of optimal mix design for marble sludge reuse. Marble sludge, a calcium-rich residual from stone-cutting processes, can be repurposed by mixing it with various ingredients. However, determining the optimal mix design is challenging due to the variability in sludge composition and the costly, time-consuming nature of experimental data collection. Also, we investigate the possibility of using machine learning models using meta-learning as an optimization tool to estimate the correct quantity of stone-cutting sludge to be used in aggregates to obtain a mix design with specific mechanical properties that can be used successfully in the building industry. Our approach offers two key advantages: (i) through simulations, a large dataset can be generated, saving time and money during the data collection phase, and (ii) Utilizing machine learning models, with performance enhancement through hyper-parameter optimization via meta-learning, to estimate optimal mix designs reducing the need for extensive manual experimentation, lowering costs, minimizing environmental impact, and accelerating the processing of quarry sludge. Our idea promises to streamline the marble sludge reuse process by leveraging collective data and advanced machine learning, promoting sustainability and efficiency in the stonecutting sector.


Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval

Lee, Daeun, Choi, Jaewoong, Mizuseki, Hiroshi, Lee, Byungju

arXiv.org Artificial Intelligence

Recent studies have increasingly applied natural language processing (NLP) to automatically extract experimental research data from the extensive battery materials literature. Despite the complex process involved in battery manufacturing -- from material synthesis to cell assembly -- there has been no comprehensive study systematically organizing this information. In response, we propose a language modeling-based protocol, Text-to-Battery Recipe (T2BR), for the automatic extraction of end-to-end battery recipes, validated using a case study on batteries containing LiFePO4 cathode material. We report machine learning-based paper filtering models, screening 2,174 relevant papers from the keyword-based search results, and unsupervised topic models to identify 2,876 paragraphs related to cathode synthesis and 2,958 paragraphs related to cell assembly. Then, focusing on the two topics, two deep learning-based named entity recognition models are developed to extract a total of 30 entities -- including precursors, active materials, and synthesis methods -- achieving F1 scores of 88.18% and 94.61%. The accurate extraction of entities enables the systematic generation of 165 end-toend recipes of LiFePO4 batteries. Our protocol and results offer valuable insights into specific trends, such as associations between precursor materials and synthesis methods, or combinations between different precursor materials. We anticipate that our findings will serve as a foundational knowledge base for facilitating battery-recipe information retrieval. The proposed protocol will significantly accelerate the review of battery material literature and catalyze innovations in battery design and development.


Learning material synthesis-process-structure-property relationship by data fusion: Bayesian Coregionalization N-Dimensional Piecewise Function Learning

Kusne, A. Gilad, McDannald, Austin, DeCost, Brian

arXiv.org Artificial Intelligence

Autonomous materials research labs require the ability to combine and learn from diverse data streams. This is especially true for learning material synthesis-process-structure-property relationships, key to accelerating materials optimization and discovery as well as accelerating mechanistic understanding. We present the Synthesis-process-structure-property relAtionship coreGionalized lEarner (SAGE) algorithm. A fully Bayesian algorithm that uses multimodal coregionalization to merge knowledge across data sources to learn synthesis-process-structure-property relationships. SAGE outputs a probabilistic posterior for the relationships including the most likely relationships given the data.


Autonomous synthesis of metastable materials

Ament, Sebastian, Amsler, Maximilian, Sutherland, Duncan R., Chang, Ming-Chiang, Guevarra, Dan, Connolly, Aine B., Gregoire, John M., Thompson, Michael O., Gomes, Carla P., van Dover, R. Bruce

arXiv.org Artificial Intelligence

Autonomous experimentation enabled by artificial intelligence (AI) offers a new paradigm for accelerating scientific discovery. Non-equilibrium materials synthesis is emblematic of complex, resource-intensive experimentation whose acceleration would be a watershed for materials discovery and development. The mapping of non-equilibrium synthesis phase diagrams has recently been accelerated via high throughput experimentation but still limits materials research because the parameter space is too vast to be exhaustively explored. We demonstrate accelerated synthesis and exploration of metastable materials through hierarchical autonomous experimentation governed by the Scientific Autonomous Reasoning Agent (SARA). SARA integrates robotic materials synthesis and characterization along with a hierarchy of AI methods that efficiently reveal the structure of processing phase diagrams. SARA designs lateral gradient laser spike annealing (lg-LSA) experiments for parallel materials synthesis and employs optical spectroscopy to rapidly identify phase transitions. Efficient exploration of the multi-dimensional parameter space is achieved with nested active learning (AL) cycles built upon advanced machine learning models that incorporate the underlying physics of the experiments as well as end-to-end uncertainty quantification. With this, and the coordination of AL at multiple scales, SARA embodies AI harnessing of complex scientific tasks. We demonstrate its performance by autonomously mapping synthesis phase boundaries for the Bi$_2$O$_3$ system, leading to orders-of-magnitude acceleration in establishment of a synthesis phase diagram that includes conditions for kinetically stabilizing $\delta$-Bi$_2$O$_3$ at room temperature, a critical development for electrochemical technologies such as solid oxide fuel cells.